A vector space model approach to identify genetically related diseases

Author

  • Indra Neil Sarkar
Abstract

OBJECTIVE The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Their study is made harder still by the fact that many genes may be causally related to multiple diseases. This study explored the relationship between diseases by adapting an approach pioneered in information retrieval: vector space models.

MATERIALS AND METHODS A vector space model approach was developed that bridges gene-disease knowledge inferred across three knowledge bases: Online Mendelian Inheritance in Man, GenBank, and Medline. The approach was then used to identify potentially related diseases for two target diseases: Alzheimer disease and Prader-Willi syndrome.

RESULTS For both Alzheimer disease and Prader-Willi syndrome, a set of plausible diseases was identified that may warrant further exploration.

DISCUSSION This study furthers seminal work by Swanson et al. that demonstrated the potential for mining literature for putative correlations. Using a vector space modeling approach, information from both biomedical literature and genomic resources (like GenBank) can be combined to identify putative correlations of interest. To this end, the relevance of the diseases predicted by the vector space modeling approach in this study was validated against supporting literature.

CONCLUSION The results of this study suggest that a vector space model approach may be a useful means to identify potential relationships between complex diseases, and thereby enable the coordination of gene-based findings across multiple complex diseases.
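The abstract does not say how the disease vectors are built, how entries are weighted, or which similarity measure is used. As a minimal sketch of the general idea only, the Python below assumes each disease is represented as a vector of gene weights and that diseases are compared by cosine similarity, a common choice in information retrieval vector space models. The disease names, gene names, and weights are hypothetical placeholders, not data from the study.

```python
import math

# Hypothetical toy data (NOT from the study): disease -> {gene: weight}.
# In practice the weights might be derived from co-occurrence counts
# across knowledge bases; that weighting scheme is an assumption here.
disease_gene_weights = {
    "disease_A": {"GENE1": 2.0, "GENE2": 1.0},
    "disease_B": {"GENE1": 4.0, "GENE2": 2.0},
    "disease_C": {"GENE2": 1.0, "GENE3": 3.0},
    "disease_D": {"GENE4": 1.0},
}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared_genes = set(u) & set(v)
    dot = sum(u[g] * v[g] for g in shared_genes)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_related(target, vectors):
    """Rank every other disease by similarity to the target disease."""
    scores = [
        (name, cosine_similarity(vectors[target], vec))
        for name, vec in vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_related("disease_A", disease_gene_weights)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup, diseases that share heavily weighted genes with the target rank highest, and diseases with no genes in common score zero. Candidates surfaced this way would still need to be checked against supporting literature, as the abstract describes.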

Similar resources

A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements

Financial statement fraud has increasingly become a serious problem for business, government, and investors. In fact, this threatens the reliability of capital markets, corporate heads, and even the audit profession. Auditors in particular face their apparent inability to detect large-scale fraud, and there are various ways to identify this problem. In order to identify this problem, the majori...

Retracted: Using genetic algorithm approach to solve a multi-product EPQ model with defective items, rework, and constrained space

The Economic Production Quantity (EPQ) model is often used in the manufacturing sector to assist firms in determining the optimal production lot size that minimizes overall production-inventory costs. There are some assumptions in the EPQ model that restrict this model for real-world applications. Some of these assumptions are (1) infinite space of warehouse, (2) all of the pr...

On the character space of vector-valued Lipschitz algebras

We show that the character space of the vector-valued Lipschitz algebra $Lip^{\alpha}(X, E)$ of order $\alpha$ is homeomorphic to the cartesian product $X\times M_E$ in the product topology, where $X$ is a compact metric space and $E$ is a unital commutative Banach algebra. We also characterize the form of each character on $Lip^{\alpha}(X, E)$. By appealing to the injective tensor product, we the...

Vector Space semi-Cayley Graphs

The original aim of this paper is to construct a graph associated to a vector space. By inspiration of the classical definition for the Cayley graph related to a group we define Cayley graph of a vector space. The vector space Cayley graph ${\rm Cay(\mathcal{V},S)}$ is a graph with the vertex set the whole vectors of the vector space $\mathcal{V}$ and two vectors $v_1,v_2$ join by an edge whenever...

An Efficient Predictive Model for Probability of Genetic Diseases Transmission Using a Combined Model

In this article, a new combined approach of a decision tree and clustering is presented to predict the transmission of genetic diseases. In this article, the performance of these algorithms is compared for more accurate prediction of disease transmission under the same condition and based on a series of measures like the positive predictive value, negative predictive value, accuracy, sensitivit...

Journal title:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 19, Issue 2

Pages -

Publication date 2012